Transcriptomic forecasting with neural ordinary differential equations
Abstract
• A neural ODE method predicts future gene expression states of single cells
• We demonstrate prediction of expression states not observed in the training data
• Performance is validated by tracking the cell cycle over 3 days
• Dimensionality reduction is not required to estimate the dynamics

The bigger picture
Single-cell transcriptomics data yield snapshots of gene expression levels at the moment cells were killed prior to sequencing. The lack of temporal context for each cell hampers analysis because, under most conditions, cells are expected to be in constant flux. Several efforts have been made to infer temporal information from single-cell RNA sequencing data, such as pseudotime and RNA velocity methods. These techniques rely on projecting the data into a lower-dimensional space and can therefore only predict within the range of states already observed. We developed the RNAForecaster model, which works directly on expression counts and attempts to generalize to all possible states rather than just those in the observed data. We show that it generates accurate short- and medium-term predictions in simulated and experimental datasets.

Summary
Single-cell technologies can uncover the changes in molecular state that underlie cellular phenotypes. However, understanding dynamic cellular processes requires extending beyond inferring trajectories to estimating future gene expression states. To address this challenge, we present a neural ordinary differential-equation-based method, RNAForecaster, which predicts gene expression states in single cells for multiple future time steps in an embedding-independent manner. We demonstrate that RNAForecaster accurately predicts future transcriptomic states in simulated data that vary with time. We then show that, using metabolic labeling single-cell RNA sequencing (scRNA-seq) data from constitutively dividing cells, RNAForecaster recapitulates many of the expression changes expected during progression through the cell cycle over a 3-day period. Thus, RNAForecaster enables short-term forecasting in biological systems from high-throughput datasets with temporal information.

Introduction
Cells are constantly changing. Predicting their future states would provide a greater understanding of how cells will change naturally or in response to perturbation. A limitation of most single-cell assays is that they destroy the cell in order to measure its state. Therefore, scRNA-seq cannot track the specific trajectory of an individual cell. Rather, it yields statistical samples from populations of cells. Performing additional time-course experiments can increase the information available about the state of a dynamic process. Time-course designs provide substantial information about the system of interest but are costly and limited to the period measured. While they do not dynamically profile each cell, new live-cell imaging methods are emerging that are starting to unlock the potential of longitudinal sampling. As these technologies develop, computational algorithms are needed to determine the distinct states cells occupied in the past and to estimate how they will evolve, modeling both cellular phenotypes and their underlying molecular states.

Trajectory inference is widely applied to model transitions between cellular states (Saelens et al., Nat. Biotechnol., 2019). Building on the foundation of pseudotime, an ordering of cells based on the relative distance between their expression profiles, these methods often incorporate low-dimensional embeddings or cell subgroups to define the dynamics (Saelens et al., Nat. Biotechnol., 2019; Trapnell et al., Nat. Biotechnol., 2014; Reid and Wernisch, Bioinformatics, 2016). They have been applied to data collected at different time points (Trapnell et al., Nat. Biotechnol., 2014; Schiebinger et al., Cell, 2019), along developmental trajectories (Chen et al., Nat. Commun., 2019), and across disease states (Campbell and Yau, Nat. Commun., 2018).
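The pseudotime concept above is, at its core, an ordering of cells by transcriptional distance. As a deliberately naive illustration of that core idea, and not an implementation of any of the cited methods, the sketch below (hypothetical function naive_pseudotime) ranks cells by their Euclidean distance from a chosen root cell in PCA space; the PCA dimensionality and the root-cell choice are arbitrary assumptions.

```python
# A deliberately naive illustration of the pseudotime idea discussed above:
# order cells by transcriptional distance from a chosen root cell.
import numpy as np
from sklearn.decomposition import PCA


def naive_pseudotime(expr: np.ndarray, root_cell: int, n_components: int = 10) -> np.ndarray:
    """expr: (cells x genes) log-normalized matrix; returns a rank ordering from the root cell."""
    n_comp = min(n_components, expr.shape[0], expr.shape[1])
    pcs = PCA(n_components=n_comp).fit_transform(expr)
    dist = np.linalg.norm(pcs - pcs[root_cell], axis=1)  # Euclidean distance in PCA space
    return np.argsort(np.argsort(dist))                  # rank of each cell = naive pseudotime
```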
Another form of trajectory inference uses optimal transport to order cells along a time course by calculating the shortest path across the Waddington landscape (Schiebinger et al., Cell, 2019), by modeling the dynamics with neural ordinary differential equations (ODEs) (TrajectoryNet; Tong et al., Proc. Mach. Learn. Res., 2020), by solving Schrödinger bridges via maximum likelihood (Vargas et al., Entropy, 2021), or as a Jordan-Kinderlehrer-Otto flow learned with an input convex neural network (Bunne et al., 2022). Notably, these methods focus solely on the observed states and require further extensions to model expression values forward or backward in time or to account for lineage tracing. RNA velocity, by focusing on splicing dynamics, investigates the changes occurring in each cell from the ratio of spliced to unspliced transcripts (La Manno et al., Nature, 2018; Bergen et al., Nat. Biotechnol., 2020). These methods commonly overlay the predicted steady-state dynamics onto a low-dimensional embedding. Velocity has also been extended to predict the direction and level of translation by comparing protein measurements, an approach called protein acceleration (Gorin et al., Genome Biol., 2020). To make predictions beyond the immediate future, vector fields built on the velocity concept allow estimation of future cell states (Qiu et al., Cell, 2022; Chen et al., Sci. Adv., 2022). One of these methods, Dynamo, suggests the use of metabolic labeling variants of scRNA-seq (Battich et al., Science, 2020; Qiu et al., Nat. Methods, 2020; Hendriks et al., Nat. Commun., 2019; Erhard et al., Nature, 2019; Cao et al., Nat. Biotechnol., 2020), in which cells are treated with a modified uridine for a set period before being harvested. The modified uridine is incorporated into RNAs produced during that period, which allows more recently transcribed RNAs to be distinguished from older ones. However, these vector field predictions fall within a Uniform Manifold Approximation and Projection (UMAP) embedding rather than the full gene-dimensional dataset (Qiu et al., Cell, 2022), meaning that states unseen in the training data cannot be identified.
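Since RNA velocity figures prominently above, a minimal sketch of its central quantity may help: in the steady-state formulation (La Manno et al., Nature, 2018), a gene's velocity is its unspliced count minus a degradation-scaled spliced count, v = u − γs. The sketch below estimates γ with an ordinary least-squares slope through the origin rather than the extreme-quantile fit the published tools use, so it is a simplified illustration, not the velocyto or scVelo implementation.

```python
# Minimal sketch of the steady-state RNA-velocity idea cited above:
# estimate a per-gene degradation ratio gamma and take velocity = u - gamma * s.
import numpy as np


def naive_velocity(spliced: np.ndarray, unspliced: np.ndarray) -> np.ndarray:
    """spliced, unspliced: (cells x genes) count matrices; returns per-cell, per-gene velocity."""
    # per-gene least-squares slope of unspliced on spliced, through the origin
    gamma = (unspliced * spliced).sum(axis=0) / np.maximum((spliced ** 2).sum(axis=0), 1e-12)
    return unspliced - gamma * spliced  # > 0: gene being induced; < 0: gene being repressed
```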
To forecast expression states directly from such measurements, we developed a method based on neural ODEs (Chen et al., arXiv, 2018), RNAForecaster.

In a discrete neural network, each hidden layer can be thought of as performing a transformation on the data. Generally, to be performant on complex tasks, many hidden-layer transformations are required. Given enough layers, the composed transformations eventually approximate a continuous transformation. This principle underlies neural ODEs, in which a neural network parameterizes the continuous change in the state, dx/dt = f_θ(x(t), t), rather than a stack of discrete layers. In this way, the network can learn continuous dynamics while using fewer parameters than it would if built from many layers. The neural ODE's output is computed with an ODE solver, and the network is trained by backpropagation performed through the solver.

Training requires expression counts from two time points in the same cell. This sort of information is not available from standard scRNA-seq protocols, but it is provided by metabolic labeling protocols that profile unlabeled and EU-labeled transcripts separately, such as scEU-seq (Battich et al., Science, 2020), from which expression estimates for an earlier and a later time point can be derived for each cell. Predicted values can be compared with the actual later values to train the network, where the time between the two points is known, and the trained network can then forecast beyond the real data. A key feature distinguishing RNAForecaster is that it does not depend on a particular embedding of the data; it takes, and predicts in, the full gene expression space. This removes the limit of only predicting states observed in training. Specifically, instead of relying on previously observed states, it can provide predictions for states not previously seen. To assess predictive accuracy with an established ground truth, we first use simulated data and then apply RNAForecaster to scEU-seq cell-cycle data, forecasting hourly expression states up to 3 days after the initial measurement. Altogether, these analyses demonstrate the utility of neural ODEs for forecasting from temporally resolved single-cell datasets.

We designed RNAForecaster (RNAForecaster.jl, https://doi.org/10.5281/zenodo.6773296; paper code, https://doi.org/10.5281/zenodo.7942083) to leverage temporal resolution in the data to learn dynamics over short periods. To enable this analysis, RNAForecaster takes as input two log-normalized count matrices. The source of these matrices can be any data that yield expression estimates for the same cell at adjacent time points (Figure 1A) (see section "experimental procedures" for details). The matrices contain one cell per column, with genes ordered identically, and are kept separately because they denote expression at t = 0 in the first matrix and at t = 1 in the second. These are used to train the neural ODE. The training process begins by forming an input vector from a cell's log-normalized counts, which fill one input node each (Figure 1B). The weights connecting the input to the hidden layer(s) create a transformation with an activation function, and the output of this function represents each gene's predicted expression at t = 1, which is compared with the observed t = 1 matrix using a mean squared error (MSE) loss. As opposed to standard deep-learning implementations, the weights are updated by backpropagation through the ODE solver, allowing effective network depth without a corresponding memory cost. Because the network does not require many discrete layers, it maintains a modest memory requirement, making it computationally cheaper than many deep-learning alternatives (Chen et al., arXiv, 2018).

After the network is trained, its predictions can be fed back into it (Figure 1B) to predict transcriptional states at subsequent time steps. This can be repeated recursively until an arbitrary time step n, although error can compound with each propagation step. More generally, the ODE solver explicitly models dynamical systems whose evolution over time makes them particularly well suited to this framework. Additionally, neural ODEs remain performant with a large number of inputs (Chen et al., arXiv, 2018) and can solve the task in a tractable manner, which is critical because thousands of gene expression variables create a very large number of parameters, computationally demanding for other architectures. Further, neural ODEs have been found to handle time series variations well (Chen et al., arXiv, 2018). To illustrate how RNAForecaster performs, we show example predictions for a sample of 10 genes (Figure 1C). By learning the relationships among genes, RNAForecaster can generalize beyond the data it was trained on. The challenge of generalizing to a diverse array of states, and the determination of what limits predictability, are a focus of the applications we discuss.

To generate a benchmark for the feasibility of forecasting transcript counts, we used simulated data generated with BoolODE (Pratapa et al., Nat. Methods, 2020), as described in the "experimental procedures." Briefly, the algorithm simulates expression from a known regulatory network and incorporates transcriptional bursting and other stochastic elements. To recapitulate the way scRNA-seq yields a snapshot of expression, BoolODE samples each cell's expression from among hundreds of simulated time points. Here, we used random networks of 10 genes. For each of 100 randomly generated networks, we simulated a dataset of 2,000 cells across 801 time points (Figshare, https://doi.org/10.6084/m9.figshare.20123804.v1). We begin by training on two adjacent time points. Using a standard type of validation for assessing generalization, we selected 80% of cells for training and used the remaining cells as a test set. As a baseline, we trained a five-hidden-layer multilayer perceptron (MLP); an MLP is a simple feedforward architecture against which to benchmark the performance of the neural ODE architecture. We compared predictions on the held-out cells given their input expression. RNAForecaster significantly outperforms the MLP (p < 1e−16) (Figure 2A), although at this first time point the average MSE across simulations is below 0.015 for both networks. To test the range over which predictions can be made, we next tested the ability to predict the next 50 time steps, where the performance difference increases (p < 1e−11) (Figure 2B). Error propagates quickly in the MLP, and we observe extreme outliers as early as time step 10, with the worst at step 50. These inaccurate outlier predictions likely reflect poor performance when the network is faced with inputs outside the distribution encountered in training, a phenomenon termed "catastrophic forgetting" (French, Trends Cogn. Sci., 1999).
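To make the training and recursive forecasting loop just described more concrete, the following is a hypothetical PyTorch sketch of a neural ODE trained on paired t = 0 and t = 1 log-normalized matrices and then applied recursively to forecast later steps. RNAForecaster itself is distributed as a Julia package (RNAForecaster.jl), so the layer sizes, hyperparameters, and the use of the torchdiffeq library here are illustrative assumptions, not the published implementation.

```python
# Hypothetical PyTorch sketch of the neural-ODE forecasting idea described above.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # adjoint method: backprop through the solver


class ExpressionODE(nn.Module):
    """Parameterizes dx/dt for a log-normalized expression vector with a small network."""

    def __init__(self, n_genes: int, hidden: int = 400):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_genes, hidden), nn.Tanh(), nn.Linear(hidden, n_genes))

    def forward(self, t, x):
        return self.net(x)


def train(x_t0: torch.Tensor, x_t1: torch.Tensor, n_epochs: int = 200, lr: float = 1e-3):
    """x_t0, x_t1: (cells x genes) log-normalized counts at adjacent time points."""
    func = ExpressionODE(x_t0.shape[1])
    opt = torch.optim.Adam(func.parameters(), lr=lr)
    t_span = torch.tensor([0.0, 1.0])
    for _ in range(n_epochs):
        opt.zero_grad()
        pred_t1 = odeint(func, x_t0, t_span)[-1]      # integrate every cell from t = 0 to t = 1
        loss = nn.functional.mse_loss(pred_t1, x_t1)  # MSE against the observed t = 1 matrix
        loss.backward()                               # gradients flow through the ODE solver
        opt.step()
    return func


def forecast(func: nn.Module, x0: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Recursively feed predictions back in to forecast n_steps time units ahead."""
    t_span = torch.tensor([0.0, 1.0])
    states, x = [], x0
    with torch.no_grad():
        for _ in range(n_steps):
            x = odeint(func, x, t_span)[-1]
            states.append(x)
    return torch.stack(states)  # shape: (n_steps, cells, genes)
```

In this sketch, the one-unit integration interval corresponds to the spacing between the two input matrices, and forecast() mirrors the recursive prediction described above by re-integrating from each predicted state.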
To interpret these errors in terms of expression levels, we take a closer look in Figure 2C. The MLP remains reasonably accurate (MSE of roughly 0.02) until approximately time step 8, after which its error grows substantially (median MSE of 0.42 across the later time points). The neural ODE, by contrast, performs better, although it is far from a perfect fit (median MSE of 0.054). In some simulations the median MSE is as low as 0.017, while in others the network fits less well, producing median errors as high as 1.58 (Figure S1). To understand why some solutions perform substantially better than others, we examined the impact of the gradient descent initialization. Because the network is applied recursively, the random seed used to initialize it affects the results substantially. Even on the exact same data, different initializations can yield highly divergent predictions (Figure 2D). As we would expect, differently initialized networks find local minima that are somewhat different, and when these differences compound over recursive predictions, some networks end up performing far worse than others; for example, some produce uniformly poor predictions (Figure S2). Ensemble-based approaches leveraging varied initializations have been shown to improve predictions in dynamical systems such as weather models (Fertig et al., Tellus A Dyn. Meteorol. Oceanogr., 2007) and are readily adaptable to systems beyond weather (Kostelich et al., Biol. Direct, 2011). We therefore adopted an ensemble approach to improve RNAForecaster's ability to handle this variation and its accuracy. Using an ensemble of networks, we evaluate each one and take the median prediction as the final estimate (Figures 2E and S3). As expected, at time step 1 there is no significant difference in accuracy, but at later time steps the ensemble performs significantly better (p < 1e−6), with the magnitude of the improvement depending on t. Most notably, the ensembles are much less vulnerable to catastrophic forgetting.
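As a continuation of the previous sketch, under the same assumptions, the ensembling strategy described above can be expressed as training several networks from different random seeds and taking the element-wise median of their forecasts; the ensemble size of five here is an arbitrary choice, not the value used in the paper.

```python
# Sketch of the ensembling strategy described above, reusing `train` and `forecast`
# from the previous sketch.
import torch


def ensemble_forecast(x_t0, x_t1, x_start, n_steps, n_networks=5):
    forecasts = []
    for seed in range(n_networks):
        torch.manual_seed(seed)                   # vary the initialization per ensemble member
        func = train(x_t0, x_t1)                  # `train` from the sketch above
        forecasts.append(forecast(func, x_start, n_steps))
    stacked = torch.stack(forecasts)              # (n_networks, n_steps, cells, genes)
    return stacked.median(dim=0).values           # median is robust to divergent members
```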
Similar resources
APPLICATION NEURAL NETWORK TO SOLVE ORDINARY DIFFERENTIAL EQUATIONS
In this paper, we introduce a hybrid approach based on a neural network and an optimization technique to solve ordinary differential equations. In the proposed model we use a hyperbolic secant transformation function in the hidden layer of the neural network part and the BFGS technique in the optimization part. In comparison with existing similar neural networks, the proposed model provides solutions with high accuracy. Numerica...
Time-series forecasting using a system of ordinary differential equations
Article history: Received 7 July 2009 Received in revised form 30 August 2010 Accepted 1 September 2010
Ordinary Differential Equations with Fractal Noise
The differential equation dx(t) = a(x(t), t) dZ(t) + b(x(t), t) dt for fractal-type functions Z(t) is determined via fractional calculus. Under appropriate conditions we prove existence and uniqueness of a local solution by means of its representation x(t) = h(y(t) + Z(t), t) for certain C¹-functions h and y. The method is also applied to Itô stochastic differential equations and leads to a g...
Summary: Ordinary Differential Equations
1 Initial Value Problem. We are given a right-hand side function f(t, y) with f : [t0, T] × R^n → R^n and an initial value y0 ∈ R^n. We want to find a function y(t) with y : [t0, T] → R^n such that y′(t) exists, is continuous, and satisfies the initial value problem y′(t) = f(t, y(t)), y(t0) = y0. (1) We assume that f(t, y) satisfies a Lipschitz condition with respect to y (at least for y with |y −...
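The initial value problem summarized in this snippet is the setting in which ODE solvers, including the neural-ODE solvers discussed in the main text, operate. Purely as an illustration of the definition above, and not as part of any of the listed papers, here is a minimal fixed-step forward Euler integrator; the step count and the example equation are arbitrary choices.

```python
# Minimal forward Euler integrator for the initial value problem y'(t) = f(t, y(t)), y(t0) = y0.
# Illustrative only: fixed step size, no error control.
import numpy as np


def euler(f, y0, t0, T, n_steps=1000):
    """Integrate y' = f(t, y) from t0 to T in n_steps equal steps; returns times and states."""
    ts = np.linspace(t0, T, n_steps + 1)
    ys = [np.asarray(y0, dtype=float)]
    h = (T - t0) / n_steps
    for t in ts[:-1]:
        ys.append(ys[-1] + h * np.asarray(f(t, ys[-1])))  # Euler update: y_{k+1} = y_k + h f(t_k, y_k)
    return ts, np.stack(ys)


# Example: y' = -y with y(0) = 1, whose exact solution is exp(-t).
ts, ys = euler(lambda t, y: -y, y0=[1.0], t0=0.0, T=5.0)
```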
Journal
Journal title: Patterns
Year: 2023
ISSN: 2666-3899
DOI: https://doi.org/10.1016/j.patter.2023.100793